Genomic Survey of Flavin Monooxygenases in Wild and Cultivated Rice Provides Insight into Evolution and Functional Diversities

The flavin monooxygenase (FMO) enzyme was discovered in mammalian liver cells that convert a carcinogenic compound, N-N′-dimethylaniline, into a non-carcinogenic compound, N-oxide. Since then, many FMOs have been reported in animal systems for their primary role in the detoxification of xenobiotic compounds. In plants, this family has diverged to perform varied functions like pathogen defense, auxin biosynthesis, and S-oxygenation of compounds. Only a few members of this family, primarily those involved in auxin biosynthesis, have been functionally characterized in plant species. Thus, the present study aims to identify all the members of the FMO family in 10 different wild and cultivated Oryza species. Genome-wide analysis of the FMO family in different Oryza species reveals that each species has multiple FMO members in its genome and that this family is conserved throughout evolution. Taking clues from its role in pathogen defense and its possible function in ROS scavenging, we have also assessed the involvement of this family in abiotic stresses. A detailed in silico expression analysis of the FMO family in Oryza sativa subsp. japonica revealed that only a subset of genes responds to different abiotic stresses. This is supported by the experimental validation of a few selected genes using qRT-PCR in stress-sensitive Oryza sativa subsp. indica and stress-sensitive wild rice Oryza nivara. The identification and comprehensive in silico analysis of FMO genes from different Oryza species carried out in this study will serve as the foundation for further structural and functional studies of FMO genes in rice as well as other crop types.


Introduction
The discovery of flavin monooxygenases (FMOs) traces back to the 1960s in mammalian hepatic microsomes, where the enzyme was found to convert N-N′-dimethylaniline into its N-oxide [1]. N-N′-dimethylaniline is a tertiary amine with potential carcinogenic activity and therefore requires detoxification by the liver cell [2]. FMOs are one of the protein families required for the detoxification of carcinogens in the cell [3–8].
FMOs are found in all domains of life and metabolize xenobiotic compounds such as toxins, drugs, and pesticides. They require the cofactors flavin adenine dinucleotide (FAD) and nicotinamide adenine dinucleotide phosphate (NADPH), together with dioxygen, for their activity [7,9–13]. FMOs catalyze the transfer of hydroxyl groups to small, nucleophilic substrates containing heteroatoms such as sulfur, nitrogen, iodine, and selenium, making these substrates polar and therefore easily excretable from the cell [7,14]. Yeast has a single FMO [15], and animals have five FMOs [16–19]. The first plant FMO, YUCCA, was reported in Arabidopsis in 2001, almost 40 years after the discovery of FMOs [20]. YUCCA catalyzes the rate-limiting step
Our lab has been working on the identification and characterization of several gene families that contribute to tolerance towards abiotic stresses [64,69–73]. In this study, we have explored the presence of the FMO family of genes in 10 genomes of rice (both wild and cultivated) to understand the evolution, conservation, and functional diversification of the FMO family. We have also investigated the possible involvement of this family in providing tolerance to different abiotic stresses using publicly available expression datasets and further experimental validation.

The Number of FMO Genes Varies in Wild and Cultivated Species of Rice
A hidden Markov model (HMM) profile (PF00743) search was performed against the Gramene database [74] for ten Oryza species (Oryza brachyantha, Oryza punctata, Oryza meridionalis, Oryza glumaepatula, Oryza glaberrima, Oryza barthii, Oryza nivara, Oryza sativa subsp. indica, Oryza rufipogon, and Oryza sativa subsp. japonica). We found that the number of FMO genes with conserved binding motifs for FAD and NADPH varies among cultivated and wild rice species. There are 25
Table 1 lists the FMO genes found in the various species, which have been arranged according to their evolution. Among these species, O. brachyantha is the most distant, while O. sativa subsp. indica is the most recent one. The genes in each Oryza species have been named in ascending order of their position on the chromosomes, prefixed by the genus and species name.

The Distribution of FMO Genes on Chromosomes Reveals the Presence of Gene Clusters
The coordinates of the FMO genes from the different Oryza species were obtained from the Gramene database, and the genes from all species were mapped onto the rice chromosomes using MapGene2Chromosome v2 [75]. We found that FMO genes are mainly distributed on 11 of the 12 rice chromosomes. Chromosome (Chr) 8 was the least populated, carrying a single FMO gene each from O. brachyantha (ObrFMO18) and O. nivara (OnFMO14) (Figure 1). Across the 12 chromosomes, the largest numbers of members from a single species are present on Chr 1 and Chr 4, and the smallest on Chr 5 and Chr 8. Furthermore, careful observation of the chromosomal distribution revealed that the majority of the orthologous FMO genes in different Oryza species occur in the same order on a given chromosome. It is also noteworthy that some FMO genes form clusters, lying adjacent to each other on the chromosomes. For example, in Oryza sativa subsp. japonica, OsJFMO7 and OsJFMO8 form a cluster on Chr 3; OsJFMO13 and OsJFMO14 on Chr 6; OsJFMO16, OsJFMO17, OsJFMO18, and OsJFMO19 on Chr 7; OsJFMO22 and OsJFMO23 on Chr 9; OsJFMO24 and OsJFMO25 on Chr 10; and OsJFMO26 and OsJFMO27 on Chr 11.
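The notion of a gene cluster used above (genes lying adjacent to each other on a chromosome) can be illustrated with a minimal sketch. It assumes that each gene is described by a name, a chromosome, and start/end coordinates, and that two genes count as adjacent when the gap between them is at most `max_gap_bp`. This threshold, the function name `find_clusters`, and the coordinates in the example are hypothetical and are not taken from the study or from Gramene.

```python
from collections import defaultdict

def find_clusters(genes, max_gap_bp=50_000):
    """Group genes on the same chromosome whose neighbours lie within max_gap_bp.

    genes: iterable of (name, chromosome, start, end) tuples.
    Returns a list of (chromosome, [gene names]) for groups of two or more genes.
    The 50 kb default is an illustrative assumption, not a value from the paper.
    """
    by_chr = defaultdict(list)
    for name, chrom, start, end in genes:
        by_chr[chrom].append((start, end, name))

    clusters = []
    for chrom, items in by_chr.items():
        items.sort()  # order genes by start coordinate
        current = [items[0]]
        for gene in items[1:]:
            prev_end = current[-1][1]
            if gene[0] - prev_end <= max_gap_bp:
                current.append(gene)  # close enough: extend the running group
            else:
                if len(current) > 1:
                    clusters.append((chrom, [g[2] for g in current]))
                current = [gene]  # start a new group
        if len(current) > 1:
            clusters.append((chrom, [g[2] for g in current]))
    return clusters

# Hypothetical coordinates, for illustration only
example = [
    ("geneA", "Chr7", 1_000, 3_000),
    ("geneB", "Chr7", 8_000, 10_000),
    ("geneC", "Chr7", 900_000, 902_000),
    ("geneD", "Chr9", 5_000, 7_000),
]
print(find_clusters(example))
```

In this toy input only geneA and geneB are close enough to be grouped. With real annotation coordinates, the choice of gap threshold would need to be justified separately.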

Phylogenetic Analysis Shows Conservation among FMO Family and the Majority of Them Belong to the S-Oxygenating Clade in Different Oryza Species
To determine the evolutionary and phylogenetic relationships among FMO proteins in different Oryza species, a rooted tree was constructed using MEGA7 software [76], and the proteins were grouped into three clades based on their putative functions (Figure 2). Clade I was assigned as a pathogen defense clade, Clade II as an auxin biosynthesis clade, and Clade III as an S-oxygenating clade, based on the homology of their constituents to known and characterized Arabidopsis members [14,21]. We also noticed that OnFMO10 is an outlier in the phylogenetic tree (Figure 2). This might be because of its lower homology to all the other protein sequences of the Oryza species.
Figure 2. Evolutionary and sub-cellular localization analysis of FMO genes in wild and cultivated rice.
A rooted circular phylogenetic tree, depicting evolutionary connection between the FMO encoding genes in wild and cultivated rice, was determined by maximum likelihood method with 1000 bootstrap replicates using MEGA 7.0 and visualised using iTOL. The subcellular localization of the genes has been indicated by a color strip around the tree. Red-cytoplasm, parrot green-chloroplast, dark green-endoplasmic reticulum, orange-plastid, blue-nucleus, cyan-vacuole, pink-golgi, peach-cytoskeleton, yellow-extracellular space. The branch lengths indicate the evolutionary time between the two nodes.
The localization of FMO proteins was predicted using WolfPsort software [77]. The largest number of proteins were predicted to be localized in the cytoplasm, followed by the chloroplast (Figure 2). In Oryza sativa subsp. japonica, 11 proteins were found to be localized in the chloroplast (light green strip), 10 in the cytoplasm (red strip), 4 in the plastid (orange strip), 1 in the nucleus (blue strip), 1 in the peroxisome (purple strip), and 1 in the endoplasmic reticulum (dark green strip). However, in O. sativa subsp. indica, 13 proteins were found in the chloroplast, 11 in the cytoplasm, 4 in the plastid, and 1 in the ER. In wild rice species, besides these organelles, very few FMO proteins were also found to be localized in the vacuole (cyan strip), cytoskeleton (peach strip), and extracellular compartment (yellow strip). Based on the results of multiple sequence alignment by ClustalW and the phylogenetic tree, orthologs from different species were identified. These are represented in Table 2.

Table 2. Distribution of FMO genes in 10 genomes of rice. The chromosome-wise distribution of the identified FMO family members in the genomes of ten Oryza species, arranged according to their evolutionary history. O. brachyantha is the most distant and O. sativa subsp. japonica is the most recent among these species. Each row contains a list of genes that are orthologs of each other in different Oryza species. A blank space denotes the absence of the corresponding ortholog in the respective species. * Indicates the presence of an ortholog that belongs to a different chromosome in a particular row.

Protein Motif Analysis Reveals Five Conserved Motifs in FMO Members from Different Oryza Species
Following the phylogenetic analysis of FMO members in different Oryza species, we analyzed the conservation of different motifs, 5 of which are signature conserved motifs of FMOs. It has been reported that FMOs bind the adenine moiety of FAD through a conserved GxGxxG sequence, the FAD-binding motif, found at the N terminus of the protein [78]. An identical (GxGxxG) but less conserved motif, the NADPH-binding motif, is located at the centre of the protein [79]. Another motif, the ATG-containing motif (ATGY), is present at the C terminus and occurs in proteins that carry out N-oxidation. Finally, a conserved motif, FxGxxHxxY/F, the FMO-identifying sequence, is found in all known plant FMOs and can occur anywhere in the protein [79]. We found that orthologs from different Oryza species exhibit high conservation in protein motifs (Figure 3). Variation in protein motif conservation was observed in some orthologs from Oryza brachyantha, as it is the most distant ancestor among these species. We also checked for the presence of the signature conserved motifs of the FMO family. In our analysis, almost all of these motifs were present in the protein sequences of the different Oryza species, although a few sequences lack them. The FAD-binding motif was mostly present at the N terminus of the protein, while the ATG-containing motif was present at the C terminus. The ATG-containing motif was also observed to be present twice in some of the proteins. Besides these, 5 more conserved motifs were seen in these sequences, which are named motif 5, motif 7, motif 8, motif 9, and motif 10 in this study. Again, OnFMO10 (Figure 4B(18)) differs in protein motif arrangement from the rest of the members, with only 3 motifs present in total: the NADPH-binding motif, motif 5, and motif 9. OgFMO16 (Figure 4B(17)) shows the presence of only 1 motif, the NADPH-binding motif, and OsIFMO16 shows the presence of two motifs, the NADPH-binding and FMO-identifying motifs (Figure 4A,B). This is probably because this gene has not been sequenced properly and the available sequence in the Gramene database is missing 5 bases, which are denoted as a stretch of N in the database. In the 5th and 30th sets ( Figure 4A
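As an illustration of how the consensus patterns quoted above (GxGxxG, FxGxxHxxY/F, and ATGY) can be located in a protein sequence, the following minimal sketch uses regular expressions. It is not the motif-discovery procedure used in this study. The function name `scan_motifs` and the toy sequence are hypothetical. Because the FAD- and NADPH-binding motifs share the same GxGxxG pattern, the sketch reports positions only, and the two would have to be told apart by their location in the protein (N terminus versus centre).

```python
import re

# Consensus patterns as quoted in the text; "x" (any residue) becomes "." in the regex.
# A lookahead is used so that overlapping occurrences are also reported.
MOTIFS = {
    "GxGxxG (FAD/NADPH-binding)": re.compile(r"(?=G.G..G)"),
    "FxGxxHxxY/F (FMO-identifying)": re.compile(r"(?=F.G..H..[YF])"),
    "ATGY (ATG-containing)": re.compile(r"(?=ATGY)"),
}

def scan_motifs(sequence):
    """Return {motif label: [0-based start positions]} for one protein sequence."""
    seq = sequence.upper()
    return {label: [m.start() for m in pattern.finditer(seq)]
            for label, pattern in MOTIFS.items()}

# Toy sequence, invented for illustration (not a real FMO protein)
toy = "MGAGPSGLAAAFKGDVHHSYKKGSGNSGATGYW"
for label, positions in scan_motifs(toy).items():
    print(label, positions)
```

A sequence in which one of the lists comes back empty would correspond to the cases described above where a signature motif is absent, although a strict regular expression will also miss degenerate variants of a motif.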

Analysis of Gene Architecture Depicts Conservation among Orthologs from Different Oryza Species
The exon-intron arrangement of the genes encoding the FMO proteins in different Oryza species was evaluated using the Gene Structure Display Server tool [80]. The gene structures of the orthologous genes are represented together in Figure 5. The orthologs were largely found to share similar exon-intron arrangements, with a few exceptions. OnFMO3 possessed an exceptionally long 3′ UTR, while OmFMO5 had a long 5′ UTR (Figure 5A(2,3)). The exon-intron arrangement of ObaFMO2 was completely different from that of the other members in the set, consisting of eight exons, while the rest comprised one to a maximum of three exons (Figure 5A(4)). In the fifth set, ObrFMO2 had longer introns than OgFMO2 and OpFMO2 (Figure 5A(5)). A similar observation was also made for ObrFMO5, ObaFMO10, OpFMO11, and OrFMO15 (Figure 5A(7),B(11,15),C(19)). In the nineteenth set, ObaFMO16, OnFMO18, and OrFMO15 consisted of 8-10 exons, while the rest were either intronless or consisted of a maximum of two exons. Another example of this arrangement is the twentieth set, wherein OgFMO19, OmFMO7, and OrFMO16 contained between 4 and 8 exons, while the others consisted of two exons or no introns at all (Figure 5C(19,20)). In the sixteenth set, all the genes were intronless except ObaFMO15 (Figure 5C(16)). In the twenty-eighth set of orthologs, ObaFMO24, OmFMO27, OrFMO24, and OglFMO24 had longer introns and five exons, one more than the other genes in the same set (Figure 5D(28)). Overall, the FMO gene family shows variable sizes of exons, introns, and UTRs.

Domain Assessment Reveals Loss of Extra Copies of FAD Domain during the Course of Domestication
To assess the domain architecture of the identified putative FMO proteins, analysis was done using the SUPERFAMILY database [81]. The protein sequences were found to harbour single-to-multiple copies of the FAD/NADP-binding domain. Additionally, a nucleotide-binding domain and retroviral domains were found in a few of the sequences, such as OgFMO19, OgFMO2, ObrFMO25, and ObaFMO2. While all the FMO family proteins of the cultivated rice varieties, both indica and japonica, consisted of single-to-double repeats of the domain, the wild relatives of rice mostly contained two-to-five repeats of the domain, for example OpFMO11, OpFMO17, OrFMO15, OrFMO18, OnFMO18, and OgFMO19. In fact, OmFMO7 was found to harbor eight copies of the domain, the most among all the proteins in wild and cultivated varieties. The presence of multiple domains also correlated with the sequence length of the proteins. Interestingly, we also found a few of the domains to be arranged in such a way that they overlapped with the adjacent domain repeat, while others were split because they were linked together by a linker or spacer region (Figure 6).

Developmental and Stress-Mediated Expression Profiling of FMO Genes Reveals Tissue Specificity and Perturbations in a Subset of Genes in Different Abiotic Stresses
Next, we analyzed the expression profiles of the genes encoding the FMO family of proteins in the cultivated variety Oryza sativa subsp. japonica. Normalized and curated transcript abundance data were retrieved from the publicly available Genevestigator database. Of the 28 genes, expression data for 6 genes, viz. OsJFMO2 (Os01g0368000), OsJFMO12 (Os05g0528600), OsJFMO17 (Os07g0111900), OsJFMO18 (Os07g0112000), OsJFMO26 (Os11g0207700), and OsJFMO27 (Os11g0207900), could not be retrieved due to data unavailability.
The spatial expression profiling revealed differential expression of the genes in the different tissues, namely endosperm, embryo, seedling, culm, leaf, flag leaf, panicle, spikelet, and root. OsJFMO6 and OsJFMO28 were found to be highly expressed only in the endosperm. Most of the other genes, such as OsJFMO5, OsJFMO8, OsJFMO14, OsJFMO19, OsJFMO22, OsJFMO23, OsJFMO3, OsJFMO4, OsJFMO20, and OsJFMO7, showed low to medium levels of expression in all the tissues. Interestingly, the transcript abundance of only one gene, OsJFMO24, was maintained across tissues. OsJFMO21 and OsJFMO9 showed medium-to-high levels of expression in all the tissues. In fact, OsJFMO9 showed the maximum expression (14.36-fold change) in roots as compared to other genes. OsJFMO9 and OsJFMO10 were found to be highly expressed in the panicle (11.66- and 10.82-fold, respectively) and spikelet (11.95- and 11-fold, respectively). OsJFMO9 was highly induced (13.33-fold change) in the roots of the seedling. OsJFMO25 and OsJFMO9 showed higher expression (11.26-, 11.04-, and 10.79-fold) in the embryo (Figure 7A).

Further, we explored the expression profiles of the FMO genes under various stress conditions such as salinity, heat, cold, drought, and submergence. All expression values are given as log2 fold change. Of the 28 genes, 10 show little or no perturbation under the stresses examined: OsJFMO4, OsJFMO5, OsJFMO6, OsJFMO7, OsJFMO8, OsJFMO11, OsJFMO13, OsJFMO14, OsJFMO23, and OsJFMO28. However, 11 genes (OsJFMO1, OsJFMO3, OsJFMO9, OsJFMO10, OsJFMO15, OsJFMO16, OsJFMO19, OsJFMO20, OsJFMO21, OsJFMO24, and OsJFMO25) were found to be significantly repressed under one or another stress. Only a few genes were found to be upregulated. For example, OsJFMO10 was strongly induced under salinity stress (2.44-fold). OsJFMO1 is the only gene found to be upregulated under drought stress. OsJFMO9 was found to be 5-6-fold upregulated under cold stress, the highest level among all the genes (Figure 7B). Overall, it seems that stresses perturb only a subset of genes in this family.
The expression data for each gene are provided in the Supplementary Materials (Table S1).
Next, we validated some of the genes encoding the FMO family of proteins, selected on the basis of the publicly available stress-mediated expression profiling data. The selected genes belong to the three different clades. Leaves cut from one-month-old plants of the cultivated species Oryza sativa subsp. indica and its closest wild rice relative, Oryza nivara, were subjected to different stress conditions, namely high temperature, salinity, and drought. In Oryza sativa subsp. indica, five genes (OsIFMO1, OsIFMO4, OsIFMO10, OsIFMO23, and OsIFMO24) were found to be significantly upregulated under high temperature stress. However, none of the genes showed significant upregulation under drought stress. A contrasting trend was observed under salinity stress, where OsIFMO1, OsIFMO10, and OsIFMO24 showed downregulation (0.07 to −2.04-fold) (Figure 8A). In the wild relative O. nivara, all of the genes OnFMO1, OnFMO17, OnFMO21, OnFMO4, and OnFMO20 were significantly upregulated under drought stress. Under heat stress, OnFMO1, OnFMO17, OnFMO19, OnFMO21, and OnFMO22 showed upregulation. Under salinity stress, OnFMO19 (log2 fold change, 3.24) and OnFMO22 (log2 fold change, 2.04) were perturbed to a significant extent (Figure 8B). The ortholog of OnFMO22, i.e., OsIFMO27, was not detected under any of the stresses, indicating null expression of the gene at this developmental stage.
This is the first report on experimentally validated expression profiling of FMO-encoding genes from the wild rice O. nivara. Our expression profiling data correlated mostly, but not entirely, with the publicly available expression data of Oryza sativa subsp. japonica. OsIFMO10 and OsIFMO24 show a similar trend in heat and drought stress but an opposite trend in salt stress. Similarly, OsIFMO1 and OsIFMO4 show similar trends in drought and salinity, but opposite trends in heat stress. Thus, it can be concluded that, while FMO-encoding genes in the indica variety are induced by heat, their orthologs in O. nivara are induced by salinity and drought.

… Oryza nivara using qRT-PCR in response to heat stress (42 °C), salinity stress (200 mM NaCl) and drought stress (water withheld) for 24 h. Mean fold change (log2) is depicted, and the expression data is plotted against the untreated samples. The error bar represents the standard deviation where n = 6. *** signifies p value < 0.05 up to four or more decimal places and ** signifies p value < 0.05 for two decimal places.
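The caption above reports mean log2 fold change relative to untreated samples, with standard deviation as the error bar. The quantification method is not stated in this section, so the sketch below assumes the commonly used 2^(−ΔΔCt) approach with a single reference gene. The function names and all Ct values are hypothetical and serve only to show how a log2 fold change and its mean and standard deviation could be derived from raw qRT-PCR data.

```python
from statistics import mean, stdev

def log2_fold_change(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """log2 fold change under the 2^(-ddCt) model (assumed, not stated in the paper).

    dCt = Ct(target) - Ct(reference); ddCt = dCt(treated) - dCt(control).
    Fold change = 2 ** (-ddCt), so log2(fold change) = -ddCt.
    """
    d_treated = ct_target_treated - ct_ref_treated
    d_control = ct_target_control - ct_ref_control
    return -(d_treated - d_control)

def summarise(log2_values):
    """Mean and sample standard deviation across replicates."""
    return mean(log2_values), stdev(log2_values)

# Hypothetical Ct values: (target treated, reference treated, target control, reference control)
replicates = [
    (22.0, 18.0, 24.0, 18.0),
    (21.0, 18.0, 24.0, 18.0),
    (23.0, 18.0, 24.0, 18.0),
]
values = [log2_fold_change(*r) for r in replicates]
avg, sd = summarise(values)
print(values, avg, sd)
```

Under this model a log2 fold change of 2 corresponds to a four-fold increase in transcript abundance, and a negative value corresponds to downregulation.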

Discussion
Plant FMOs are currently a significantly under-utilized class of enzymes, but studies in animal systems show that they have rich potential for the discovery of new activities and novel bioactive metabolites. The evolutionary diversity of plant FMOs can provide a remarkable untapped resource of "green" biocatalysts [7]. Plant FMOs could provide unique breeding targets for subsistence agriculture based on their known important functions in plant metabolism. The identification of the entire complement of FMOs from the Oryza species used in this study is a significant step forward in FMO research. The comparison of FMO genes from different Oryza species also provides useful information on the evolution of this gene family and shows conservation, which points to essential roles of FMOs that the cell cannot afford to lose. However, the full range of substrates and reactions catalyzed by FMOs is yet to be identified.
The difference in the number of FMO genes in different Oryza species indicates the gain or loss of FMOs during the evolution of individual species. The size of a gene family can increase by single gene duplications or by duplication of a segment of a chromosome, i.e., segmental duplication. Gene family size can decrease by single gene deletion or by deletion within a segment [82,83]. The most distant ancestor among these species, O. brachyantha, has 25 FMO members, O. punctata has 22, and the cultivated species O. sativa subsp. indica and O. sativa subsp. japonica have 29 and 28 genes, respectively. This increase in number in domesticated rice suggests gene duplication events in their genomes. There are a few non-functional FMOs in animal systems, such as human FMO6, which is a pseudogene [19,84]. Pseudogenes may have missing promoters, deleted sequences, frameshifts, fewer introns, or premature stop codons [85,86]. It is possible that a few FMOs are non-functional in rice as well. Across different Oryza species, variable numbers of members of the jumonji C domain-containing protein family and the CBS-domain-containing protein family have been observed, implying that change in gene family size is very common in evolution [64,87,88].
Tandem duplications, which are duplications of adjacent identical chromosome segments (exon(s) or genes) resulting in the formation of another gene or specific exon, occur frequently in plant genomes [89]. The presence of FMO gene clusters on different chromosomes might be a result of tandem duplication. Similar clusters were also observed in FMOs belonging to clade II in wheat [90]. The Arabidopsis receptor-like kinase (RLK) family has been expanded via tandem duplication [91]. Furthermore, it has been reported that tandem duplication has resulted in genome size expansion in rice, which may have occurred specifically in the rice lineage as a response to different stresses [92]. These duplications within gene families may benefit the species in terms of functional diversity and response. Also, the majority of the orthologous FMO gene family members are found in the same order on the chromosomes, indicating that these orthologous genes are present as syntenic blocks in chromosomal regions.
Phylogenetic analysis reveals that the gene family is conserved across these species, as reflected in their sequence similarity. Also, the largest number of FMO members falls in clade III, suggesting a role in S-oxygenation. In 2007, the first plant FMO belonging to this clade was isolated and characterised from Arabidopsis [21]. FMO GS-OX is one of seven FMOs from this clade present in Arabidopsis [41]. This FMO catalyzes the S-oxygenation of methionine-derived glucosinolates (GSLs) such as methylthioalkyl GSLs. GSLs are plant secondary metabolites that are present in cruciferous plants, especially Brassicales. GSLs and their hydrolytic products are involved in various plant functions such as defense against pathogens and herbivores [93]. FMOs from rice and other plant species that do not produce GSLs, on the other hand, may be involved in the S-oxygenation of a wide range of sulfur-containing compounds that are widespread in the plant kingdom [39]. However, the actual substrates of these FMOs are not known. The localization of FMOs in different intracellular compartments might provide functional diversity to this gene family by allowing physical segregation and simultaneous operation of various metabolic processes in the same cell.
The majority of FMO proteins from these species have conserved motifs that have been reported in the literature. The absence of some conserved motifs in a few FMO members again raises the possibility that these are non-functional pseudogenes that remain related to other members of this family through significant sequence similarity. Based on the absence of a few conserved motifs, the recent identification of NRL [NONPHOTOTROPIC HYPOCOTYL 3/ROOT PHOTOTROPISM 2-like (NPH3/RPT2-Like)] genes from rice has revealed two pseudogenes among the 27 that are present in the genome [94]. In fact, rice has many pseudogenes that belong to different gene families [95]. However, the presence of such non-functional members in the FMO family can only be confirmed through functional studies.
Most orthologs from different Oryza species have similar exon-intron arrangements, again demonstrating the conservation of this gene family. However, the exceptionally long 3′ UTR in OnFMO3 and 5′ UTR in OmFMO5 may indicate roles in specific developmental processes, as longer 5′ UTRs are generally found in genes that regulate growth processes in a tissue-specific manner [96]. The FMO-encoding genes with longer introns may be more highly expressed than those with shorter or no introns since, in plants, genes with longer intronic regions and UTRs generally have high transcript abundance [97][98][99]. It has also been shown that longer 3′ UTRs and introns act as cis elements that promote nonsense-mediated decay (NMD), a quality-control mechanism in eukaryotes, including plants, that eliminates aberrant mRNAs carrying a premature termination codon [100]. Thus, it is possible that OnFMO3, OmFMO5, and ObrFMO2 are subjected to NMD owing to the longer UTRs and introns in their gene architecture. Different gene structures in some sets of orthologs also indicate that gene duplication in this family has been followed by subsequent diversification [101,102], which can help these FMOs perform various functions and/or support spatio-temporal regulation. The variable sizes of exons, introns, and UTRs in the FMO gene family imply that the genes have undergone extensive shuffling during evolution [103,104].
From domain analysis, we have found that FMOs in wild rice species mostly have 2–5 FAD domains, with a few members having more than five, whereas those in cultivated rice have only 1–2 FAD domains. There are three possibilities here. First, all the FAD domains in wild rice species work equally, and to reduce the genetic load on the cell, some of them have been deleted and the complete function has been assigned to fewer domains in cultivated rice. Second, only one or two domains are actually functional and sufficient for the function of an FMO, and so the extra copies of this domain have been deleted. Third, these domains may be split in wild rice and come together to form functional domains during protein folding. A recent study has shown that domain gains and losses occur frequently during proteome evolution and concurrently with the evolution of cells. Notably, the number of gains tends to exceed the losses in the proteome, which could serve to redefine the survival strategy of an organism [105]. In fact, it has been shown that higher eukaryotes trade off the organismal budget for possessing a unique number of genes and domain architecture (economy) against the ability to resist damage and adapt to environmental change [106]. It seems possible that a sharp reduction in genetic size, together with a tendency to retain flexibility and robustness during domestication, has resulted in the loss of the extra repeats of the FAD domain of the FMO family in the cultivated varieties. Furthermore, the presence of nucleotide-binding domains in some FMO proteins suggests that some members of this family play a role in pathogen defense, as this domain aids in pathogen recognition and signalling [107][108][109][110].
From the in silico expression analysis of available FMO genes of O. sativa subsp. japonica in different tissues, a few genes were found to be highly expressed in specific tissues such as endosperm, root, panicle, spikelet, seedling root and embryo. Expression in a particular tissue indicates their functional specificity. The expression of the majority of the genes, however, was consistent across all tissues. Therefore, it can be inferred that these genes might play a significant role across all developmental processes. Expression analysis of FMO genes under different abiotic stresses, as well as validation by qRT-PCR, reveals that not all members of the family respond to these stresses: only a subset of genes is downregulated, and some are upregulated. Some of the FMO genes belonging to clade II have also been found to be downregulated under drought and heat stress in wheat [90]. In our study, we also found that a few genes are downregulated under salinity stress, indicating the possibility that some members of the FMO family are not needed under these stresses and that their silencing could play a significant role in imparting stress tolerance in rice, as with the SlMYB50 and SlMYB55 genes, the silencing of which enhanced salinity and drought tolerance in tomato [111,112]. However, these data are very preliminary, and detailed spatio-temporal and multiple stress-mediated expression profiling would help us understand the trend better. We believe the findings of this study will aid in the functional characterization of this gene family in rice, opening up new avenues for future research.

Identification of the FMO Family Members in Rice
The HMM (hidden Markov model) profile of the FMO domain (PF00743) was retrieved from the Pfam database (http://pfam.xfam.org/; EMBL-EBI, UK; accessed on 12 July 2022) and used to identify the full-length protein sequences of FMO in the Gramene [74]

Phylogenetic Analysis
Multiple sequence alignment of the FMO proteins from different Oryza species was performed using Clustal Omega, and a phylogenetic tree was subsequently constructed using MEGA 7.0 [76] by the maximum likelihood method with 1000 bootstrap replicates. The tree was visualized using iTOL (https://itol.embl.de/upload.cgi, accessed on 14 July 2022). For subcellular localization, protein sequences were analyzed using the WoLF PSORT protein subcellular localization predictor (https://wolfpsort.hgc.jp/, accessed on 12 July 2022).
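As an illustration of what a single bootstrap replicate involves, the sketch below resamples alignment columns with replacement. It is not the MEGA implementation; the function name `bootstrap_alignment` and the toy sequences are hypothetical, and tree inference on each replicate is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    `alignment` maps sequence names to aligned strings of equal length.
    Columns are drawn with replacement, so the replicate has the same
    number of columns as the original alignment.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices with replacement.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy example (hypothetical sequences, not taken from the study).
toy = {"seqA": "MKVA-IG", "seqB": "MKIAVIG", "seqC": "MRVA-LG"}
rng = random.Random(42)  # fixed seed for reproducibility
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), "replicates generated")
```

In a full analysis, a tree would be inferred from each replicate, and the support for a clade would be the fraction of replicate trees in which that clade appears.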

Motif Analysis
The identified FMO sequences were analyzed for the presence of conserved motifs using the multiple expectation maximization for motif elicitation (MEME) program (http://meme-suite.org/, accessed on 27 July 2022), with the default parameters and the maximum number of motifs set to 10.

Evaluation of Domain Architecture of the FMO Family Proteins
The protein sequences of the genes encoding the FMO family of proteins were analyzed using the SUPERFAMILY database (supfam.org; accessed on 28 July 2022) for SCOP assessment using the default parameters.

Evaluation of the Gene Structure
The architecture of the genes encoding FMO proteins was determined and visualized using the Gene Structure Display Server 2.0 tool (gao-lab.org; accessed on 29 July 2022), with genome and CDS sequences as queries.

Developmental and Stress-Mediated Expression Profiling of the FMO Encoding Genes
Normalized, curated expression data for the FMO-encoding genes under different stresses and in various anatomical parts at different developmental stages were retrieved from the Affymetrix and RNA-seq datasets of the publicly available Genevestigator database (https://genevestigator.com; NEBION, Switzerland; accessed on 4 August 2022). The heatmaps were generated using the MeV 4.9.0 application.
For qRT-PCR-based expression profiling of selected FMO genes, leaves cut from one-month-old plants of Oryza sativa subsp. indica and its closest wild relative Oryza nivara, grown under controlled greenhouse conditions, were subjected to salt stress (200 mM NaCl), drought (water withheld) and high temperature (42 °C) for 24 h. Total RNA was isolated using Trizol reagent (ThermoFisher Scientific, Waltham, MA, USA), and first-strand cDNA synthesis was carried out using the RevertAid first-strand cDNA synthesis kit (ThermoFisher Scientific, Waltham, MA, USA). Real-time PCR was performed on the Applied Biosystems 7500 Step-One Instrument (Applied Biosystems, Foster City, CA, USA). The eukaryotic Elongation Factor-1α (eEF-1α) was used as an internal control for normalization in the qRT-PCR. The log2 expression values of each gene under salinity, heat and drought conditions were calculated with respect to the untreated control (having a log2 value of 0) [113].
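A minimal sketch of how such log2 values can be derived from Ct values, assuming the commonly used 2^(−ΔΔCt) approach with eEF-1α as the reference gene. The text above does not spell out the calculation, so the method is an assumption, and the function name and Ct values below are hypothetical.

```python
from statistics import mean


def log2_fold_change(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Log2 relative expression of a target gene, treated vs. control.

    Each argument is a list of replicate Ct values. Assumes roughly 100%
    amplification efficiency, so fold change = 2 ** (-ddCt) and the log2
    fold change is simply -ddCt.
    """
    # Normalize the target gene to the reference gene in each condition.
    d_ct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # Compare the treated sample with the untreated control.
    dd_ct = d_ct_treated - d_ct_control
    return -dd_ct


# Hypothetical Ct values: the target crosses the threshold two cycles
# earlier under stress, while the reference gene is unchanged.
lfc = log2_fold_change(
    ct_target_treated=[24.0, 24.0],
    ct_ref_treated=[18.0, 18.0],
    ct_target_control=[26.0, 26.0],
    ct_ref_control=[18.0, 18.0],
)
print(lfc)
```

By construction, comparing the untreated control with itself gives a log2 value of 0, positive values indicate upregulation under stress, and negative values indicate downregulation.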

Conclusions
Rice genome sequencing has substantially sped up the discovery and characterization of genes important to breeders and has provided a new understanding of their evolutionary history. In our study, we have identified the entire complement of the FMO family from 10 different Oryza species, both wild and cultivated. Wild and cultivated rice species have different numbers of FMO genes. Phylogenetic, motif, and gene structure analyses indicate that this family is conserved across these species. The phylogenetic analysis of FMO orthologs in cultivated rice and its wild relatives also reveals that nearly all of the orthologs that share comparable structural organizations cluster together with convincing bootstrap support (1000 replicates). Some of the genes from O. brachyantha differ in motif and gene structure from other orthologs because this species has the FF genome and is the most distant ancestor. Unlike O. brachyantha, O. punctata, which has the BB genome and thus differs from the rest of the Oryza species with an AA genome, did not show any difference among its orthologs. From domain analysis, we have found that wild rice species have multiple domains that have subsequently been lost during the evolution of the cultivated species. The expression of this gene family in different rice tissues is also reported in this study. From the in silico expression analysis and validation of selected genes by qRT-PCR in cultivated rice O. sativa subsp. indica and wild rice O. nivara under different abiotic stresses, it can be concluded that not all members of the family respond to different stresses, implying that they perform diverse functions in the cell. Altogether, this study provides the total number of FMO genes present in rice, which has not been reported until now, and also assesses the similarities and differences in this family across different Oryza species. This information can help researchers in the functional characterization of rice FMOs.
The stress responsiveness of some FMOs reported in this study further encourages researchers to use them for improving stress tolerance in cultivated rice varieties or for neodomestication of crop wild relatives (CWRs).

Data Availability Statement:
The datasets supporting the conclusions of this article are included within the article and its additional files. The sequence data for all the Oryza species were obtained from the Gramene data resource (https://gramene.org; accessed on 12 July 2022). For O. sativa subsp. japonica, the sequences were also retrieved from the RGAP (http://rice.plantbiology.msu.edu/; accessed on 15 September 2021).